We analyzed the geographic location of GitHub users, which was scraped from their public user profiles. Our analysis focuses on GitHub users who have contributed to a licensed R package. Although 77.23% of the geographic information (country and city name) was missing, we were still able to map 488,104 GitHub users in our final analyses.

Forming the City Code

The goal of forming a city code is to understand the city-level network. This will enable us to answer questions such as: Which city is the most influential? Is that city more likely to form domestic or international collaborations? Our city code combines country name, city name, and geocode (latitude and longitude). The geocode is critical for differentiating two cities with the same name within the same country. You will be surprised to see how many Greenvilles there are in the U.S. However, because the original geocode is accurate to the sixth digit, users located in the same city might have slightly different geocodes. We propose a geocode cleaning and aggregating process:
Cleaning: round the geocode (latitude and longitude) to whole numbers.
Aggregating: group together cities whose geocodes are within 2 degrees of each other.
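
The two steps above can be sketched in Python as follows. This is a minimal illustration, not the code used in the analysis. The function names, the input tuple layout, and the greedy "first matching anchor" grouping rule are all assumptions, since the text does not specify how ties or chains of nearby points are resolved. The Greenville coordinates are approximate and for illustration only.

```python
from collections import defaultdict


def clean_geocode(lat, lon):
    """Cleaning: round latitude and longitude to whole degrees.

    Note: Python's round() rounds exact .5 values to the nearest even integer.
    """
    return round(lat), round(lon)


def build_city_codes(users, max_diff=2):
    """Assign each user a city code (country, city, lat, lon).

    users: iterable of (user_id, country, city, lat, lon) tuples
           (a hypothetical layout chosen for this sketch).
    Returns a dict mapping user_id -> city code tuple.
    """
    # Rounded geocodes already seen for each (country, city) name pair.
    anchors = defaultdict(list)
    codes = {}
    for user_id, country, city, lat, lon in users:
        rlat, rlon = clean_geocode(lat, lon)
        key = (country, city)
        # Aggregating (assumed rule): reuse the first anchor that is within
        # max_diff degrees on both axes; otherwise start a new anchor.
        for alat, alon in anchors[key]:
            if abs(alat - rlat) <= max_diff and abs(alon - rlon) <= max_diff:
                rlat, rlon = alat, alon
                break
        else:
            anchors[key].append((rlat, rlon))
        codes[user_id] = (country, city, rlat, rlon)
    return codes


# Illustrative records: two users in one Greenville, one in another.
users = [
    ("u1", "US", "Greenville", 34.852600, -82.394000),
    ("u2", "US", "Greenville", 34.844300, -82.401200),
    ("u3", "US", "Greenville", 35.612700, -77.366400),
]
codes = build_city_codes(users)
for user_id, code in codes.items():
    print(user_id, code)
```

In this sketch, u1 and u2 share a city code because their rounded geocodes coincide, while u3 receives a separate code because its longitude differs by more than 2 degrees, even though the country and city names are identical.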

Top Continents Where GitHub Users are Located

North America is the continent with the highest number of GitHub users, followed by Europe and Asia. Later, we break our network down into six networks, one for each continent, and investigate collaborations within and across networks.

Top Countries Where GitHub Users are Located

The United States has the highest number of GitHub users (N=166,140), more than four times as many as China (N=38,391). The differences are much smaller among the United Kingdom, India, Russia, and the rest of the countries in the following figure.

Where Are GitHub Users Scattered in the U.S.?

The U.S. has the largest number of GitHub users. They are mostly located on the West and East Coasts, with California, New Jersey, Washington, Texas, and Massachusetts being the top five states. A sector analysis would provide more insight into why GitHub users are located in these states. The pattern also ties to the technology industry being heavily concentrated in these areas.

Top Cities Where GitHub Users are Located

San Francisco (US) is the city with the highest number of GitHub users, followed by London (UK), New York (US), Moscow (Russia), and Beijing (China).

City-Level Interactive Map of GitHub Users

In this map, the size of each circle indicates how many GitHub users reported living or working in that city. You can zoom in and hover over a circle to see how many users are in that city. You can also type the name of a city you are interested in into the search box.